Abstract



Motivated by the problem of routing reliably and scalably in a graph, we introduce the notion
of a splicer, the union of spanning trees of a graph. We prove that for any bounded-degree
n-vertex graph, the union of two random spanning trees approximates the expansion of every cut
of the graph to within a factor of $O(\log n)$. For the random graph $G_{n,p}$, for $p > c \log n/n$, two
spanning trees give an expander. This is suggested by the case of the complete graph, where
we prove that two random spanning trees give an expander. The construction of the splicer is
elementary: each spanning tree can be produced independently using an algorithm by Aldous
and Broder, namely a random walk in the graph with edges leading to previously unvisited
vertices included in the tree.

A second important application of splicers is to graph sparsification, where the goal is to
approximate every cut (and more generally the quadratic form of the Laplacian) using only a
small subgraph of the original graph. Benczur-Karger [5] as well as Spielman-Srivastava [22]
have shown sparsifiers with $O(n \log n/\epsilon^2)$ edges that achieve approximation within factors $1 + \epsilon$
and $1 - \epsilon$. Their methods, based on independent sampling of edges, need $\Omega(n \log n)$ edges to
get any approximation (else the subgraph could be disconnected) and leave open the question of
linear-size sparsifiers. Splicers address this question for random graphs by providing sparsifiers
of size $O(n)$ that approximate every cut to within a factor of $O(\log n)$.

1 Introduction 

In this paper, we present a new method for obtaining sparse expanders from spanning trees. This
appears to have some interesting consequences. We begin with some motivation.

Recovery from failures is considered one of the most important problems with the internet today
and is at or near the top of wish-lists for a future internet. In his 2007 FCRC plenary lecture,
Shenker desires a network where "even right after failure, routing finds path to destination" [21].



How should routing proceed in the presence of link or node failures? 

At a high-level, to recover from failures, the network should have many alternative paths, 
a property sometimes called path diversity, which is measured by several parameters, including 
robustness in the presence of failures and congestion. It is well-known that expander graphs have 
low congestion and remain connected even after many (random) failures. Indeed, there is a large 
literature on routing to minimize congestion and on finding disjoint paths that is closely related to 
expansion (or more generally, conductance); e.g. [20 1 flTt lB]. 

However, in practice, efficient routing also needs to be compact and scalable; in particular, the 
memory overhead as the network grows should be linear or sublinear in the number of vertices. This 
requirement is satisfied by routing on trees, one tree per destination. In fact, the most commonly 
used method in practice is shortest path routing, which is effectively one tree per destination.^1 Since



^1 It is called Open Shortest Path First (OSPF) in networking terminology.



the final destination determines the next edge to be used, this gives an $O(n)$ bound on the size of
the routing table that needs to be stored at each vertex. If a constant-factor stretch is allowed, this
can be reduced. For example, with stretch 3, tables of size $O(\sqrt{n})$ suffice, as shown by Abraham et
al. [1].

The main problem with shortest-path routing or any tree-based scheme is the lack of path 
diversity. Failing any edge disconnects some pairs of vertices. Recovery is usually achieved by 
recomputing shortest path trees in the remaining network, an expensive procedure. Further, con- 
gestion can be high in principle. This is despite the fact that the underlying graph might have high 
expansion, implying that low congestion and high fault-tolerance are possible. There is some evi- 
dence that AS-level internet topologies are expanders and some stochastic models for networks lead 
to expanders [H]. However, known algorithms that achieve near-optimal congestion use arbitrary 
paths in the network and therefore violate the scalability requirement. This raises the following 
question: is it possible to have a routing scheme that is both scalable and achieves congestion and 
fault-tolerance approaching that of the underlying graph? 

Our work is inspired by Motiwala et al. [16], who consider a conceptually simple extension
of tree-based routing, using multiple trees. With one tree there is a unique path between any two 
points. With two trees, by allowing a path to switch between the trees multiple times, there could 
be a large number of available paths. Motiwala et al. showed experimentally that a small number 
of randomly perturbed shortest path trees for each destination leads to a highly reliable routing 
method: the union of these trees has reliability approaching that of the underlying graph. 

This raises the question whether the results of this experiment can be true in general. I.e., for 
a given graph does there exist a small collection of spanning trees such that the reliability of the 
union approaches that of the base graph? As a preliminary step, we study the question of whether 
for a given graph the union of a few spanning trees captures the expansion of the original graph. 
Here we propose a construction that uses only a small number of trees in total (as opposed to one 
tree per destination) and works for graphs with bounded degree and for random graphs. The trees 
are chosen independently from the uniform distribution over all spanning trees, a distribution that 
can be sampled efficiently with simple algorithms. The simplest of these, due to Aldous [2] and
Broder [6], is to take a random walk in the graph, and include in the tree every edge that goes to a 
previously unvisited vertex. Roughly speaking, our main result is that for bounded degree graphs 
and for random graphs a small number of such trees give a subgraph with expansion comparable 
to the original graph for each cut. 

A second important application of splicers is to graph sparsification, where the goal is to
approximate every cut (and more generally the Laplacian quadratic form) using only a small subgraph of
the original graph. Benczur-Karger [5] as well as Spielman-Srivastava [22] have shown sparsifiers
with $O(n \log n/\epsilon^2)$ edges that achieve a $1 + \epsilon$ approximation. Their methods are based on
independent sampling of edges with carefully chosen edge probabilities and require $\Omega(n \log n)$ edges
to get any approximation; with fewer edges, the subgraph obtained could be disconnected. They
leave open the question of the existence of linear-size sparsifiers. Splicers, constructed using random
spanning trees, provide sparsifiers of size $O(n)$ for random graphs: when the base graph is random,
with high probability, the union of two spanning trees approximates all cuts to within a factor of
$O(\log n)$. We state this precisely in the next section.






1.1 Our results 

A $k$-splicer is the union of $k$ spanning trees of a graph. By a random $k$-splicer we mean the union
of $k$ uniformly randomly chosen spanning trees. We show that for any bounded degree graph, the
union of two random spanning trees of the graph approximates the expansion of every cut of the
graph. Using more trees gives a better approximation. In the following, $\delta_G(A)$ stands for the set of
edges in graph $G$ that have exactly one endpoint in $A$, a subset of vertices of $G$.
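As a small illustration of these two definitions, the following sketch forms a 2-splicer from two hand-picked spanning trees and compares a cut in the splicer with the same cut in the base graph. The base graph (here $K_4$), the two trees, and the helper names `splicer` and `cut_edges` are assumptions chosen for the example; the trees are fixed rather than random.

```python
def normalize(edge):
    """Store an undirected edge with its smaller endpoint first."""
    u, v = edge
    return (u, v) if u <= v else (v, u)

def splicer(trees):
    """Union of the edge sets of the given spanning trees (a k-splicer)."""
    return {normalize(e) for tree in trees for e in tree}

def cut_edges(edges, A):
    """delta(A): edges with exactly one endpoint in the vertex set A."""
    A = set(A)
    return {normalize(e) for e in edges if (e[0] in A) != (e[1] in A)}

# Hypothetical example: base graph K_4 and two of its spanning trees.
G = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
T1 = [(0, 1), (1, 2), (2, 3)]   # a Hamiltonian path
T2 = [(0, 1), (0, 2), (0, 3)]   # a star centered at vertex 0
U = splicer([T1, T2])

for A in ({1}, {0, 1}):
    print(sorted(A), len(cut_edges(U, A)), len(cut_edges(G, A)))
```

For each vertex set the printed pair is $|\delta_U(A)|$ followed by $|\delta_G(A)|$, the two quantities compared in the results below.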

Theorem 1.1. For a $d$-regular graph $G = (V,E)$, let $U_G^k$ be a random $k$-splicer, obtained by the
union of $k$ uniformly random spanning trees. Also let $\alpha > 0$ be a constant and $\alpha(k - 1) > 9d^2$.
Then with probability $1 - o(1)$, for every $A \subset V$, we have

$$|\delta_{U_G^k}(A)| \ge \frac{1}{\alpha \log n}\,|\delta_G(A)|.$$

Our proof of this makes novel use of a known property of random spanning trees of a graph, 
namely the events of an edge in the graph being included in the tree are negatively correlated. 

Next we give a lower bound, showing that the factor $1/\log n$ is the best possible for $k$-splicers
constructed from random spanning trees for any constant $k$.

Theorem 1.2. For every $n$, there is a bounded-degree edge expander $G$ on $n$ vertices such that
with probability $1 - o(1)$ the edge expansion of a random $k$-splicer $U_G^k$ is at most $k^2/(C \log n)$ for any
$k \ge 1$.

For the complete graph, one can do better, requiring only two trees to get a constant-factor
approximation.

Theorem 1.3. The union of two uniformly random spanning trees of the complete graph on n 
vertices has constant vertex expansion with probability 1 — o(l). 

Since constant vertex expansion implies constant edge expansion, we get that the union of two 
uniformly random spanning trees has constant edge expansion with high probability. 

Next we turn to the random graph $G_{n,p}$. Our main result here is that w.h.p., $G_{n,p}$ has two
spanning trees whose union has constant vertex expansion. We give a simple random process (called
Process $B_p$ henceforth) to find these trees.

Theorem 1.4. There exists an absolute constant $C$, such that for $p > C \log n/n$, with probability
$1 - o(1)$, the union of two random spanning trees from Process $B_p$ applied to a random graph $H$
drawn from $G_{n,p}$ has constant vertex expansion.

The proof of this theorem is via a coupling lemma (Lemma 7.2) showing that a tree generated
by Process $B_p$ applied to a random graph $H$ is nearly uniform among spanning trees of the complete
graph.

Theorem 1.4 relates to the work of [22] and leads to the first linear-size sparsifier with
nontrivial approximation guarantees for random graphs:

Theorem 1.5. Let $p > C \log n/n$ for a sufficiently large constant $C$. Let $H$ be a $G(n,p)$ random
graph, and let $H'$ be the 2-splicer obtained from it via Process $B_p$, with a weight of $pn$ on every
edge. Then with probability $1 - o(1)$, for every $A \subset V$ we have

$$c_1 |\delta_H(A)| \le w(\delta_{H'}(A)) \le c_2 |\delta_H(A)| \log n,$$

where $c_1, c_2 > 0$ are constants.

Here $w(\cdot)$ denotes the sum of the weights.



1.2 Related work 

The idea of using multiple routing trees and switching between them is inspired by the work of 
[16] who proposed a multi-path extension to standard tree-based routing. The method, called Path 
Splicing, computes multiple trees to each destination vertex, using simple methods to generate the 
trees; in one variant, each tree is a shortest path tree computed on a randomly perturbed set of 
edge weights. Path splicing appears to do extremely well in simulations, approaching the reliability 
of the underlying graph using only a small number of trees.^2

Sampling for approximating graph cuts was introduced by Karger, first for global min-cuts and 
then extended to min s-t cuts and flows. The most recent version, due to Benczur and Karger [5],
approximates the weight of every cut of the graph within factors of $1 + \epsilon$ and $1 - \epsilon$ using $O(n \log n/\epsilon^2)$
samples; edges are sampled independently with probability inversely proportional to a connectivity
parameter and each chosen edge is weighted with the reciprocal of its probability. Recently,
Spielman and Srivastava [22] gave a similar method where edges are sampled independently with
probability proportional to the edge resistance and weighted in a similar way, by the reciprocal of
the probability with which they are chosen. They show that every quadratic form of the Laplacian
of the original graph is approximated within factors $1 - \epsilon$ and $1 + \epsilon$. The similarity in the two
methods extends to their analysis also — both parameters, edge strength and edge resistance share 
a number of useful properties. 

It has long been known that the union of three random perfect matchings in a complete graph 
with even number of vertices (see, e.g., [9]) is an expander with high probability. Our result on 
the union of random spanning trees from the complete graph can be considered as a result in a 
similar vein, and our proof has a similar high-level outline. Still, the spanning trees case seems to 
be different and requires some new ideas. 

On the other hand, our result for the union of spanning trees of bounded degree graphs doesn't 
seem to have any analog for the union of matchings. Indeed, generating random perfect matchings 
of graphs is a highly nontrivial problem, the case of computing the permanent of 0-1 matrices being 
the special case for bipartite graphs [10].

2 Preliminaries 

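Let $G = (V,E)$ be an undirected graph. For $v \in V$, define $\Gamma(v) := \{u \in V : (u,v) \in E\}$, the
set of neighbors of $v$. For $A \subseteq V$, define $\Gamma(A) := \bigcup_{v \in A} \Gamma(v)$, and $\Gamma'(A) := \Gamma(A) \setminus A$. Finally, let
$\delta_G(A) := \{(u,v) \in E : u \in A, v \notin A\}$. The edge expansion of $G$ is

$$\min_{A \subseteq V,\ 1 \le |A| \le |V|/2} \frac{|\delta_G(A)|}{|A|}.$$

The vertex expansion of $G$ is

$$\min_{A \subseteq V,\ 1 \le |A| \le |V|/2} \frac{|\Gamma'(A)|}{|A|}.$$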

We say that a family of graphs is an edge (resp., vertex) expander (family) if the edge (resp., 
vertex) expansion of the family is bounded below by a positive constant. 
Let Kn denote the complete graph on n vertices. 
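For very small graphs both quantities can be computed directly from the definitions by enumerating all vertex sets $A$ with $1 \le |A| \le |V|/2$. The sketch below does this by brute force (exponential in the number of vertices, so it is only meant as a sanity check); the function name `expansions` and the two example graphs are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations

def expansions(n, edges):
    """Return (edge expansion, vertex expansion) of a graph on vertices 0..n-1."""
    best_edge = None
    best_vertex = None
    for size in range(1, n // 2 + 1):
        for subset in combinations(range(n), size):
            A = set(subset)
            # |delta_G(A)|: edges with exactly one endpoint in A
            boundary = sum(1 for u, v in edges if (u in A) != (v in A))
            # Gamma'(A): vertices outside A adjacent to some vertex of A
            outside = {w for u, v in edges
                       for a, w in ((u, v), (v, u))
                       if a in A and w not in A}
            edge_ratio = Fraction(boundary, size)
            vertex_ratio = Fraction(len(outside), size)
            if best_edge is None or edge_ratio < best_edge:
                best_edge = edge_ratio
            if best_vertex is None or vertex_ratio < best_vertex:
                best_vertex = vertex_ratio
    return best_edge, best_vertex

cycle6 = [(i, (i + 1) % 6) for i in range(6)]
k4 = list(combinations(range(4), 2))
print(expansions(6, cycle6))
print(expansions(4, k4))
```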



^2 It has several other features from a practical viewpoint, such as allowing end vertices to specify paths, that we
do not discuss in detail here. 



For $a \in \mathbb{R}$, let $[a] = \{i \in \mathbb{N} : 1 \le i \le a\}$. On several occasions we will use the inequality

$$\binom{n}{k} \le \left(\frac{en}{k}\right)^k.$$
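The standard bound $\binom{n}{k} \le (en/k)^k$ for $1 \le k \le n$ is used repeatedly in the union bounds later on. A quick numerical spot check, in floating point and over a small range only, can be sketched as follows; the helper name `binomial_bound_holds` is an assumption for this example.

```python
import math

def binomial_bound_holds(n, k):
    """Check C(n, k) <= (e*n/k)^k for one pair with 1 <= k <= n."""
    return math.comb(n, k) <= (math.e * n / k) ** k

# Spot check on a small range; this is a sanity check, not a proof.
print(all(binomial_bound_holds(n, k)
          for n in range(1, 61) for k in range(1, n + 1)))
```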



3 Uniform random spanning trees 

Uniformly random spanning trees of graphs are fairly well-studied objects; see, e.g., [13]. In this 
section we describe properties of random spanning trees that will be useful for us. There are several 
algorithms known for generating a uniformly random spanning tree of a graph, e.g., [2, 6, 19, 13].
The algorithm due to Aldous and Broder is very simple and will be useful in our analysis: Start a 
uniform random walk at some arbitrary vertex of the graph, and when the walk visits a vertex for 
the first time, include the edge used to reach that vertex in the tree. When all the vertices have 
been visited we have a spanning tree which is uniformly random regardless of the initial vertex. 
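The random walk procedure just described can be sketched in a few lines. The adjacency-list representation (a dict from vertex to list of neighbors), the function name `aldous_broder`, and the example graph are assumptions for illustration; the input graph is assumed to be connected, otherwise the loop does not terminate.

```python
import random

def aldous_broder(adj, start=None, rng=random):
    """Sample a spanning tree by the random walk of Aldous and Broder.

    adj maps each vertex to the list of its neighbors (connected graph).
    Returns the list of tree edges (u, v), where v was first reached from u.
    """
    vertices = list(adj)
    current = vertices[0] if start is None else start
    visited = {current}
    tree = []
    while len(visited) < len(vertices):
        nxt = rng.choice(adj[current])      # one step of the uniform walk
        if nxt not in visited:              # first visit: keep the entering edge
            visited.add(nxt)
            tree.append((current, nxt))
        current = nxt
    return tree

# Hypothetical example: a 5-cycle with one chord (0, 2).
adj = {0: [1, 4, 2], 1: [0, 2], 2: [1, 3, 0], 3: [2, 4], 4: [3, 0]}
print(aldous_broder(adj, start=0, rng=random.Random(1)))
```

Calling the function $k$ times with independent randomness and taking the union of the returned edge sets gives a random $k$-splicer.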

A well-known fact (e.g. [12]) about uniform random spanning trees is that the probability that 
an edge e is chosen in a uniform random spanning tree, is equal to the effective resistance of e: Let 
each edge have unit resistance, then the effective resistance of e is the potential difference applied 
to the endpoints of e to induce a unit current. This fact shows a connection of our work with [22],
who sample edges in a graph according to their effective resistances to construct a sparsifier. 
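On small graphs the inclusion probability of an edge can be computed exactly by counting spanning trees. The sketch below uses Kirchhoff's matrix-tree theorem and the identity $P[e \in T] = 1 - \tau(G - e)/\tau(G)$, where $\tau$ counts spanning trees; by the fact above, this value equals the effective resistance of $e$. The function names are assumptions for this example, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction
from itertools import combinations

def count_spanning_trees(n, edges):
    """Matrix-tree theorem: determinant of the Laplacian minus row/column 0."""
    L = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1
        L[v][v] += 1
        L[u][v] -= 1
        L[v][u] -= 1
    M = [row[1:] for row in L[1:]]
    m = n - 1
    det = Fraction(1)
    for col in range(m):                      # Gaussian elimination
        pivot = next((r for r in range(col, m) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)                # singular: graph is disconnected
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            det = -det
        det *= M[col][col]
        for r in range(col + 1, m):
            factor = M[r][col] / M[col][col]
            for c in range(col, m):
                M[r][c] -= factor * M[col][c]
    return det

def edge_probability(n, edges, e):
    """P[e in T] for a uniform spanning tree T, via deleting e."""
    total = count_spanning_trees(n, edges)
    without = count_spanning_trees(n, [f for f in edges if f != e])
    return 1 - without / total

k5 = list(combinations(range(5), 2))
print(edge_probability(5, k5, (0, 1)))
```

For the complete graph $K_n$ every edge has inclusion probability $2/n$, which is the value used in Section 4.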

For a connected base graph $G = (V,E)$, the random variable $T_G$ denotes a uniformly random
spanning tree of $G$. $U_G^k$ will denote the union of $k$ such trees chosen independently. For an edge $e \in E$,
abusing notation a little, we will refer to events $e \in E(T_G)$ and $e \in E(U_G^k)$ as $e \in T_G$ and $e \in U_G^k$.

Negative correlation of edges. The events of various edges belonging to the random spanning
tree are negatively correlated: For any subset of edges $e_1, \dots, e_k \in E$ we have

$$P[e_1, e_2, \dots, e_k \in T_G] \le P[e_1 \in T_G]\, P[e_2 \in T_G] \cdots P[e_k \in T_G]. \tag{1}$$

A similar property holds for the complementary events:

$$P[e_1 \notin T_G, \dots, e_k \notin T_G] \le P[e_1 \notin T_G]\, P[e_2 \notin T_G] \cdots P[e_k \notin T_G]. \tag{2}$$

These are easy corollaries of [13, Theorem 4.5], which in turn is based on the work of Feder and
Mihail [8].
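On a tiny graph the pairwise case of inequality (1) can be verified by enumerating all spanning trees. The following sketch does so with a union-find acyclicity test; it checks pairs of edges only, not arbitrary subsets, and the function names and the choice of $K_4$ are assumptions for this example.

```python
from fractions import Fraction
from itertools import combinations

def spanning_trees(n, edges):
    """All spanning trees of a graph on vertices 0..n-1, by brute force."""
    trees = []
    for subset in combinations(edges, n - 1):
        parent = list(range(n))

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        acyclic = True
        for u, v in subset:
            ru, rv = find(u), find(v)
            if ru == rv:            # edge would close a cycle
                acyclic = False
                break
            parent[ru] = rv
        if acyclic:                 # n-1 acyclic edges form a spanning tree
            trees.append(frozenset(subset))
    return trees

def pairs_negatively_correlated(n, edges):
    """Check P[e, f in T] <= P[e in T] * P[f in T] for every pair of edges."""
    trees = spanning_trees(n, edges)

    def prob(S):
        return Fraction(sum(1 for t in trees if S <= t), len(trees))

    return all(prob(frozenset((e, f))) <= prob(frozenset((e,))) * prob(frozenset((f,)))
               for e, f in combinations(edges, 2))

k4 = list(combinations(range(4), 2))
print(len(spanning_trees(4, k4)), pairs_negatively_correlated(4, k4))
```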

Negatively correlated random variables and tail bounds. For $e \in E$, define indicator
random variables $X_e$ to be 1 if $e \in T_G$, and 0 otherwise. Then we can rewrite (1) as follows.
For any subset of edges $e_1, \dots, e_k \in E$ we have

$$E[X_{e_1} \cdots X_{e_k}] \le E[X_{e_1}] \cdots E[X_{e_k}]. \tag{3}$$

For random variables $\{X_e\}$ satisfying (3) we say that $\{X_e\}$ are negatively correlated. Several
closely related notions exist; see Dubhashi and Ranjan [7], and Pemantle [18]. Dubhashi and Ranjan [7] gave a property
of negative correlation that will be useful for us: It essentially says that Chernoff's bound for the
tail probability for sums of independent random variables applies unaltered to negatively correlated
random variables. More precisely, we will use the following version of Chernoff's bound.



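Theorem 3.1. Let $\{X_i\}_{i=1}^n$ be a family of 0-1 negatively correlated random variables such that
$\{1 - X_i\}_{i=1}^n$ are also negatively correlated. Let $p_i$ be the probability that $X_i = 1$. Let $p := \frac{1}{n} \sum_{i \in [n]} p_i$.
Then for $\lambda > 0$

$$P\Big[\sum_{i \in [n]} X_i < pn - \lambda\Big] \le e^{-\lambda^2/(2pn)}.$$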

Proof. The proof splits into two steps: In the first step we prove that for arbitrary $\lambda$ we have

$$E\Big[\exp\Big(\lambda \sum_{i=1}^n X_i\Big)\Big] \le \prod_{i=1}^n E[\exp(\lambda X_i)]. \tag{4}$$

The second step is a standard Chernoff bound argument as in the proof of Theorem A.1.13 in [3].
Since the first step is not well-known and is not hard, we provide a proof here. In this, we basically
follow Dubhashi and Ranjan [7].

The case $\lambda = 0$ is trivially true. We now prove (4) for $\lambda > 0$. Since the $X_i$'s take 0-1 values, for
any integers $a_1, \dots, a_n > 0$, we have $X_1^{a_1} X_2^{a_2} \cdots X_n^{a_n} = X_1 X_2 \cdots X_n$. Now, writing $\exp(\lambda \sum_{i=1}^n X_i)$
using the Taylor series for $e^x$, and expanding each summand, we get a sum over various
monomials over the $X_i$'s. For each monomial we have by the definition of negative correlation that
$E[X_1 \cdots X_n] \le \prod_{i=1}^n E[X_i]$. This gives (4) for $\lambda > 0$.

For $\lambda < 0$, a similar argument using $1 - X_i$ in the role of $X_i$ gives (4). □

4 Expansion when base graph is a complete graph 

Our proof here has the same high-level outline as the proof for showing that the union of three 
random perfect matchings in a complete graph with even number of vertices is a vertex-expander 
(see, e.g., [9]): One shows that for any given vertex set $A$ of size $\le n/2$, the probability is very small
for the event that $|\Gamma'(A)|$ is small in the union of the matchings. A union bound argument then
shows that the probability is small for the existence of any set $A$ with $|\Gamma'(A)|$ small. However, new
ideas are needed because spanning trees are generated by the random walk process, which appears 
to be more complex to analyze than random matchings in complete graphs. 

Proof (of Theorem 1.3). For a random spanning tree $T$ in $K_n$ and given $A \subseteq V$, $|A| = a$, we will
give an upper bound on the probability that $|\Gamma'_T(A)| \le ca$, for a given expansion constant $c$ (recall
that $\Gamma'_T(A)$ denotes the set of vertices in $V \setminus A$ that are neighbors of vertices in $A$ in the graph
$T$). To this end, we will fix a set $A' \subseteq V \setminus A$ of size $\lfloor ca \rfloor$ and we will bound the probability that
$\Gamma'_T(A) \subseteq A'$, and, to conclude, use a union bound over all possible choices of $A$ and $A'$. Without loss
of generality the vertices are labeled $V = \{1, \dots, n\}$, $A = [a] = \{1, \dots, a\}$ and $A' = \{a + 1, \dots, a + \lfloor ca \rfloor\}$.
More precisely, the union bound is the following: the probability that there exists a set $A \subseteq V$ such
that $|A| \le n/2$ and $|\Gamma'(A)| \le ca$ in the union of $t$ random independent spanning trees is at most

$$\sum_{a=1}^{\lfloor n/2 \rfloor} \binom{n}{a} \binom{n}{\lfloor ca \rfloor} P\big(\Gamma_T(A) \subseteq [a + ca]\big)^t. \tag{5}$$



We will bound different parts of this sum in two ways: First, for $a \le n/12$, we use the random
walk construction of a random spanning tree which, as we will see, can be interpreted as every
vertex in $A$ picking a random neighbor (but not in a completely independent way). Second, for
$a \in (n/12, n/2]$, we look at all the edges of the cut as if they were independent by means of negative
correlation.

So, for the first part of the sum in (5), $a \le n/12$, consider a random walk on $V$, whose states
are denoted $(X_1, X_2, \dots)$, starting outside of $A$, that defines a random spanning tree (as in the
random walk algorithm). Let $\tau_i$ be the first time that the walk has visited $i$ different vertices of $A$.
For $i = 1, \dots, a - 1$, let $Y_i = \tau_{i+1} - \tau_i$ (the gap between first visits $i$ and $i + 1$). We have that the
random variables $Y_i$ are independent. Let $Z_i$ be the indicator of "$Y_i = 1$", and let $Z = \sum_{i=1}^{a-1} Z_i$ be
the number of adjacent first visits. We have

$$P(Z_i = 1) = P(Y_i = 1) = \frac{a - i}{n - 1}.$$

We now give an upper bound to the probability that the predecessor to the first visit of vertex $i$ in
$[a]$ is in $[a + ca]$, given that this predecessor is not a first visit itself (in this case, the edge coming
into $i$ is within $[a + ca]$). That is, for $2 \le i \le a$,

$$P\big(X_{\tau_i - 1} \in [a + ca] \mid Z_{i-1} = 0\big) = \frac{\lfloor ca \rfloor + i - 1}{n - a + i - 1} \le \frac{ca + i - 1}{n - a + i - 1},$$

and

$$P\big(X_{\tau_1 - 1} \in [a + ca]\big) = \frac{\lfloor ca \rfloor}{n - a} \le \frac{ca}{n - a}.$$

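Thus, using edges added when the walk goes from $V \setminus A$ to $A$ and ignoring edges in the other
direction:

$$P\big(\Gamma_T(A) \subseteq [a + ca]\big) = \sum_{z=0}^{a-1} P\big(\Gamma_T(A) \subseteq [a + ca] \mid Z = z\big)\, P(Z = z)$$

$$\le \sum_{z=0}^{a-1} \Big( \prod_{j=1}^{a-z} \frac{a + ca - j}{n - j} \Big) P(Z = z)$$

$$\le \sum_{z=0}^{a-1} \binom{a-1}{z} \Big(\frac{a + ca}{n}\Big)^{a-z} \Big(\frac{a}{n}\Big)^{z}$$

$$= \frac{a + ca}{n} \Big( \frac{a + ca}{n} + \frac{a}{n} \Big)^{a-1}$$

$$\le \Big( \frac{2(1 + c)a}{n} \Big)^{a}.$$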



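We now use this in (5), for $a \le n/12$. Let $K = 2(1 + c)$.

$$\sum_{a=1}^{\lfloor n/12 \rfloor} \binom{n}{a} \binom{n}{\lfloor ca \rfloor} P\big(\Gamma_T(A) \subseteq [a + ca]\big)^t \le \sum_{a=1}^{\lfloor n/12 \rfloor} \Big(\frac{en}{a}\Big)^{a} \Big(\frac{en}{ca}\Big)^{ca} \Big(\frac{Ka}{n}\Big)^{at}$$

$$= \sum_{a=1}^{\lfloor n/12 \rfloor} (\alpha K^t)^{a} \Big(\frac{a}{n}\Big)^{a(t-1-c)} \qquad \Big(\text{where } \alpha = \frac{e^{1+c}}{c^c}\Big)$$

$$\le \sum_{a=1}^{\lfloor \sqrt{n} \rfloor} \big(\alpha K^t n^{-(t-1-c)/2}\big)^{a} + \sum_{a=\lfloor \sqrt{n} \rfloor + 1}^{\lfloor n/12 \rfloor} \big(\alpha K^t 12^{-(t-1-c)}\big)^{a}$$

$$\le \frac{\alpha K^t n^{-(t-1-c)/2}}{1 - \alpha K^t n^{-(t-1-c)/2}} + \frac{\big(\alpha K^t 12^{-(t-1-c)}\big)^{\sqrt{n}}}{1 - \alpha K^t 12^{-(t-1-c)}},$$

which goes to 0 as $n \to \infty$ when $\alpha K^t/12^{t-1-c} < 1$, and this happens for $t = 2$ and a sufficiently
small constant $c$.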

For the rest of the sum in (5), $a \in (n/12, n/2]$, we use negative correlation of the edges of a
random spanning tree $T$ (Section 3) to estimate the probability that $\Gamma_T(A) \subseteq [a + ca]$. Any fixed
edge from $K_n$ appears in $T$ with probability $2/n$. We have that $\Gamma_T(A) \subseteq [a + ca]$ iff no edge between
$A$ and $V \setminus [a + ca]$ is present in $T$, and negative correlation (Equation (2)) implies that this happens
with probability at most $(1 - 2/n)^{a(n-(a+ca))}$. Thus,



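$$\sum_{a=\lfloor n/12 \rfloor + 1}^{\lfloor n/2 \rfloor} \binom{n}{a} \binom{n}{\lfloor ca \rfloor} P\big(\Gamma_T(A) \subseteq [a + ca]\big)^t \le \sum_{a=\lfloor n/12 \rfloor + 1}^{\lfloor n/2 \rfloor} \Big(\frac{en}{a}\Big)^{a} \Big(\frac{en}{ca}\Big)^{ca} \Big(1 - \frac{2}{n}\Big)^{ta(n-(a+ca))}$$

$$\le n \sup_{\gamma \in [1/12, 1/2]} \Big(\frac{e}{\gamma}\Big)^{\gamma n} \Big(\frac{e}{c\gamma}\Big)^{c\gamma n} \Big(1 - \frac{2}{n}\Big)^{t\gamma n(n-(1+c)\gamma n)}$$

$$\le n \sup_{\gamma \in [1/12, 1/2]} \bigg[ \Big(\frac{e}{\gamma}\Big)^{1+c} c^{-c} \bigg]^{\gamma n} e^{-2t\gamma n(1-(1+c)\gamma)} = n \sup_{\gamma \in [1/12, 1/2]} f(\gamma)^{\gamma n}.$$

For any fixed $c > 0$, the function

$$f(\gamma) = \frac{(e/\gamma)^{1+c}}{c^c\, e^{2t(1-(1+c)\gamma)}}$$

is convex for $\gamma > 0$ and hence the sup is attained at one of the boundary points $1/12$ and $1/2$,
and the function is strictly less than 1 at these boundary points for $t = 2$ and a sufficiently small
constant $c$. This implies that this sum goes to 0 as $n \to \infty$. □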
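The boundary condition in the last step is easy to check numerically. The sketch below assumes that the function in question is $f(\gamma) = (e/\gamma)^{1+c} / (c^c e^{2t(1-(1+c)\gamma)})$ and evaluates it at $\gamma = 1/12$ and $\gamma = 1/2$; the value $c = 0.001$ is an arbitrary illustrative choice of a "sufficiently small" constant, not one taken from the argument.

```python
import math

def f(gamma, c, t):
    """Assumed form: f(gamma) = (e/gamma)^(1+c) / (c^c * e^(2t(1-(1+c)gamma)))."""
    numerator = (math.e / gamma) ** (1 + c)
    denominator = c ** c * math.exp(2 * t * (1 - (1 + c) * gamma))
    return numerator / denominator

c = 0.001  # hypothetical small expansion constant
for t in (1, 2):
    print(t, f(1 / 12, c, t), f(1 / 2, c, t))
```

With two trees ($t = 2$) both boundary values are below 1, while with a single tree ($t = 1$) the value at $\gamma = 1/2$ exceeds 1, so this particular bound gives nothing for one tree.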



5 Expansion when base graph is a bounded-degree graph: positive 
result 

In this section we consider graphs with bounded degrees. To simplify the presentation we restrict
ourselves to regular graphs; it is easy to drop this restriction at the cost of extra notation. We 
show that for constant degree graphs the edge expansion is captured fairly well by the union of a 
small number of random spanning trees. 

Proof (of Theorem 1.1). It follows by the random walk construction of random spanning trees that
for any edge $(u,v) \in E$ we have $P[(u,v) \in T] \ge 1/d(u)$. To see this, note that if we start the
random walk at vertex $u$ then with probability $1/d(u)$ the first traversed edge is $(u,v)$, which then
gets included in $T$. Thus for $A \subset V$, we have that

$$E[|\delta_{T_G}(A)|] \ge \frac{1}{d} |\delta_G(A)|.$$


We would now like to use the above expectation result to prove our theorem. Recall the
definition of random variables $X_e$ from Section 3. For edge $e \in E$, $X_e$ is the indicator random
variable taking value 1 if $e \in T$, and value 0 otherwise. Thus we have $|\delta_T(A)| = \sum_{e \in \delta_G(A)} X_e$.
We want to show that $\sum_{e \in \delta_G(A)} X_e$ is not much smaller than its expectation with high probability.
Random variables $X_e$ are not independent. Fortunately, they are negatively correlated as we saw
in Section 3, which allows us to use Theorem 3.1:

$$P\Big[\sum_{e \in \delta_G(A)} X_e < p|\delta_G(A)| - \lambda\Big] \le e^{-\lambda^2/(2p|\delta_G(A)|)} \le e^{-\lambda^2/(2|\delta_G(A)|)}, \tag{6}$$

where $p$ is the average of $P[X_e = 1]$ for $e \in \delta_G(A)$. Since $P[X_e = 1] \ge 1/d$ for all edges $e$, we have
$p \ge 1/d$, and for $\lambda = (p - 1/(2d))|\delta_G(A)|$ we have

$$P\Big[|\delta_{T_G}(A)| < \frac{1}{2d} |\delta_G(A)|\Big] \le e^{-\frac{|\delta_G(A)|}{8d^2}}.$$

Which gives

$$P\Big[|\delta_{U_G^k}(A)| < \frac{1}{2d} |\delta_G(A)|\Big] \le e^{-\frac{k|\delta_G(A)|}{8d^2}}. \tag{7}$$



Now we estimate the probability that there is a bad cut, namely a cut $A$ such that $|\delta_{U_G^k}(A)| = a$
and $|\delta_G(A)| \ge \alpha a \ln n$. To do this we first look at cuts of size $a$ in the first random tree, which have
size at least $\alpha a \ln n$ in $G$. (This step is necessary: the modified Chernoff bound that we use is only
as strong as the independent case, and when edges are chosen independently one is likely to get
isolated vertices; looking at the first tree ensures that this does not happen.) In order to be bad,
these cuts have to have small size in all the remaining trees. The probability of that happening is
given by (7). The number of cuts in the first tree of size $a$ is clearly no more than $\binom{n-1}{a} < \binom{n}{a}$, as
there are $\binom{n-1}{a}$ ways of picking $a$ edges out of $n - 1$, although not all of these may correspond to
valid cuts. Then, the probability that a bad cut exists is at most

$$\sum_{a=1}^{n/\ln n} \binom{n}{a} e^{-\frac{(k-1)\alpha a \ln n}{8d^2}} \le \sum_{a=1}^{n/\ln n} \Big(\frac{en}{a}\Big)^{a} e^{-\frac{(k-1)\alpha a \ln n}{8d^2}}$$

$$= \sum_{a=1}^{n/\ln n} \exp\bigg( \Big( \ln(en/a) - \frac{(k-1)\alpha \ln n}{8d^2} \Big) a \bigg)$$

$$= \sum_{a=1}^{n/\ln n} \exp\bigg( \Big( \ln(e/a) + \Big(1 - \frac{(k-1)\alpha}{8d^2}\Big) \ln n \Big) a \bigg).$$

Choosing $(k - 1)\alpha > 9d^2$ makes the above sum $o(1)$. □

6 Expansion when base graph is a bounded-degree graph: negative
result

Here we show that Theorem 1.1 is best possible up to a constant factor for expansion:

Proof (of Theorem 1.2). We begin with a $d$-regular edge expander $G'$ on $n$ vertices with a
Hamiltonian cycle (such graphs are known to exist), where $d > 2$ is a fixed integer. Let $0 < \ell < \log n$ be
an integer to be chosen later, and let $H$ be a Hamiltonian path in $G'$. Subdivide $H$ into subpaths
$P_1, \dots, P_{n/\ell}$, each of length $\ell$ (to keep the formulas simple we suppress the integrality issues here,
which are easily taken care of).

For two subpaths $P_i$ and $P_j$, we say that they interact if $(P_i \cup \Gamma'(P_i)) \cap (P_j \cup \Gamma'(P_j)) \ne \emptyset$. Since
$G'$ is $d$-regular, $|\Gamma'(P_j)| \le d\ell$. So, any subpath can interact with at most $d^2\ell$ other subpaths (this
bound is slightly loose). Thus we can find a set $I$ of $\frac{1}{d^2\ell} \cdot n/\ell$ paths among $P_1, \dots, P_{n/\ell}$, so that no
two paths in $I$ interact.

We now describe the construction of $G$, which will be obtained by adding edges to $G'$. For each
path $P \in I$, we do the following. Add an edge between the two end-points of $P$, if such an edge
did not already exist in $G'$. If the subgraph $G[\Gamma'(P)]$ induced by the neighborhood of path $P$ does
not have a Hamiltonian cycle, then we add edges to it so that it becomes Hamiltonian. Clearly, in
doing so we only need to increase the degree of each vertex by at most 2. The final graph that we
are left with is our $G$. For each path $P \in I$ we fix a Hamiltonian cycle in $G[\Gamma'(P)]$, and we also
have the cycle of which $P$ is a part. We denote these two cycles by $C_1(P)$ and $C_2(P)$.

We will generate a random spanning tree $T$ of $G$ by the random walk algorithm, starting the
random walk at some vertex outside of all paths in $I$. For $P \in I$, we say that event $E_P$ (over the
choice of a random spanning tree $T$ of $G$) occurs if the random walk, on first visit to $C_1(P) \cup C_2(P)$,
first goes around $C_1(P)$ without going out or visiting any vertex twice, and then it goes on to
traverse $C_2(P)$, again without going out or visiting any vertex twice until it has visited all vertices
in $C_2(P)$. For all $P \in I$ we have

$$P[E_P] \ge \frac{1}{(d+2)^{|C_1(P)| + |C_2(P)| - 1}} \ge \frac{1}{(d+2)^{(d+1)\ell - 1}}. \tag{8}$$

If event $E_P$ happens then in the resulting tree $T$ we have $|\delta_T(V(P))| = 1$. Thus our goal will be to
show that with substantial probability there is a $P \in I$ such that $E_P$ happens. Since no two paths
in $I$ interact with each other, events $E_P$ are mutually independent. If we are choosing $k$ random
spanning trees, then define $E_P^k$ to be the event that $E_P$ occurs for all $k$ spanning trees. Clearly,
$P[E_P^k] = P[E_P]^k$. Then the probability that $E_P^k$ doesn't occur for any $P \in I$ is at most

$$\bigg(1 - \frac{1}{(d+2)^{k((d+1)\ell - 1)}}\bigg)^{|I|} \le \exp\bigg( - \frac{n}{(d+2)^{k(d+1)\ell - k + 2}\, \ell^2} \bigg).$$

It follows readily that there is a constant $C$ (that depends on $d$) such that for $\ell k \le C \log n$
the above probability is $o(1)$. Hence, with probability $1 - o(1)$ there is a path $P \in I$ such that
$|\delta_{U_G^k}(V(P))| \le k$. The edge expansion of $P$ therefore is at most $k/\ell = k^2/(C \log n)$ for $\ell = C(\log n)/k$. □

7 Splicers of random graphs 

We will show a random process on random graphs that generates random spanning trees with a 
distribution that is very close to the uniform distribution on the complete graph. The process first 
directs edges to mimic the distribution of a directed random graph. 

Given an undirected graph $H$ and a parameter $0 < p < 1$, construct a random directed graph
denoted $D_p(H)$ with vertex set $V(H)$ and independently for every edge $(u,v)$ of $H$:

• edges $(u,v)$ and $(v,u)$ with probability $\frac{2 - p - 2\sqrt{1-p}}{p}$,

• only edge $(u,v)$ with probability $\frac{p - 1 + \sqrt{1-p}}{p}$, and

• only edge $(v,u)$ with probability $\frac{p - 1 + \sqrt{1-p}}{p}$.

If $H$ is random according to $G_{n,p}$, then $D_p(H)$ is random with each edge picked with probability
$q = 1 - \sqrt{1-p}$. Note that $p/2 \le q \le p$.
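A short numerical check makes the purpose of these three probabilities concrete. Assuming they are $(2 - p - 2\sqrt{1-p})/p$ for both directions and $(p - 1 + \sqrt{1-p})/p$ for each single direction, they sum to 1, and once multiplied by the probability $p$ that the undirected edge is present they give each directed edge marginal probability $q = 1 - \sqrt{1-p}$ and both directions probability $q^2$. The function name below is an assumption for this sketch.

```python
import math

def direction_probabilities(p):
    """Conditional probabilities (both, only (u,v), only (v,u)) given the edge is in H."""
    s = math.sqrt(1 - p)
    both = (2 - p - 2 * s) / p
    only_one = (p - 1 + s) / p
    return both, only_one, only_one

p = 0.3
both, only_uv, only_vu = direction_probabilities(p)
q = 1 - math.sqrt(1 - p)
print(both + only_uv + only_vu)   # total probability
print(p * (both + only_uv), q)    # marginal probability of the directed edge (u, v)
print(p * both, q * q)            # both directions present
```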

Let $\mathcal{T}$ be the uniform distribution on spanning trees of $K_n$. We now describe Process $B_p$, which
is a random process that given an undirected graph $H$ and a parameter $0 < p < 1$ generates a
spanning tree with a distribution that we denote $\mathcal{T}_{p,H}$. Consider the following random process that
generates a walk in $D_p(H)$ or stops with no output:

1. Start at a vertex $v_0$ of $D_p(H)$.

2. At a vertex $v$, an edge is traversed as follows. Suppose $d_1(v)$ out of $d(v)$ outgoing edges at
$v$ are previously traversed. Then, the probability of picking a previously traversed edge is
$1/(n - 1)$ while the probability for each new edge is

$$\frac{1 - \frac{d_1(v)}{n-1}}{d(v) - d_1(v)}.$$

3. If all vertices have been visited, output the walk and stop. If this has not happened and at
the current vertex $v$ one has $d_1(v) = d(v)$, stop with no output.
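The three steps can be sketched as follows, taking the directed graph as given. The representation of the directed graph as a dict of out-neighbor lists, the function name `process_b`, and the choice to return the tree edges (those used on first visits) rather than the whole walk are assumptions for this illustration.

```python
import random

def process_b(out_neighbors, start, rng=random):
    """Run the walk on a directed graph; return tree edges, or None on failure.

    out_neighbors maps every vertex to the list of heads of its outgoing edges.
    """
    n = len(out_neighbors)
    traversed = {v: [] for v in out_neighbors}              # used out-edges at v
    fresh = {v: list(ws) for v, ws in out_neighbors.items()}  # unused out-edges
    visited = {start}
    tree = []
    v = start
    while len(visited) < n:                 # step 3: stop once all are visited
        if not fresh[v]:
            return None                     # d_1(v) = d(v): stop with no output
        d1 = len(traversed[v])
        if rng.random() < d1 / (n - 1):
            # each previously traversed edge has probability 1/(n-1)
            w = rng.choice(traversed[v])
        else:
            # remaining mass 1 - d_1(v)/(n-1), split evenly among new edges
            w = fresh[v].pop(rng.randrange(len(fresh[v])))
            traversed[v].append(w)
        if w not in visited:                # first visit: edge joins the tree
            visited.add(w)
            tree.append((v, w))
        v = w
    return tree

# Hypothetical inputs: a complete directed graph, and a graph with a dead end.
complete = {v: [w for w in range(5) if w != v] for v in range(5)}
print(process_b(complete, 0, rng=random.Random(0)))
print(process_b({0: [1], 1: [0], 2: []}, 0))
```

On the complete directed graph every step is uniform over the other $n - 1$ vertices, so the walk is an ordinary random walk on $K_n$ and never fails; in the second input vertex 2 is unreachable and the process stops with no output.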
As in the random walk algorithm, the spanning tree given by Process Bp (if it succeeds in visiting
all the vertices) is the set of edges that are used on first visits to each vertex, but the random 
sequence of edges is different here. 

A covering path of a graph is a path passing through all vertices. Let $D$ be the distribution on
covering paths of the (undirected) complete graph starting at a vertex $v_0$ where a random path is
generated by a random walk that starts at $v_0$ and walks until it has visited all the vertices. Let $D_p$
be the distribution on covering paths of the complete graph given by first choosing $H$ according to
$G_{n,p}$ and running Process $B_p$ starting from $v_0$.

Lemma 7.1. There exists an absolute constant c such that for p > clogn/n the total variation 
distance between the distributions D and Dp is o(l). 

Proof. We will couple $D$ and $D_p$ so that the walk in $D$ picks the same edges as the walk in $D_p$,
but if $D_p$ fails, then $D$ continues its random walk. Then these covering walks coincide whenever
$D_p$ succeeds, and thus the probability of failure is an upper bound to the total variation distance
between $D$ and $D_p$. Now, $D_p$ does not fail if every vertex in $D_p(H)$ has out-degree at least $c_1 \log n$ and
Process $B_p$ does not visit any vertex more than $c_2 \log n$ times, for $c_1 > c_2$. A Chernoff bound gives
$c$ and $c_1$ such that the first part happens with probability $1 - o(1)$. For the second part, we observe
that if there is no failure then Process $B_p$ behaves exactly like a random walk in the complete
graph, and therefore it visits all vertices in at most $c_3 n \log n$ steps with probability $1 - o(1)$ for some
constant $c_3$ (this is essentially the coupon collector's problem with $n - 1$ coupons, see [17, Section
3.6 and Chapter 6]) and a walk of that length does not visit any vertex more than $c_2 \log n$ times
with probability $1 - o(1)$ for some constant $c_2$ (by a straightforward variation of the occupancy
problem in [17, Section 3.1]). □
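The coupon collector connection can be made concrete. For a uniform random walk on $K_n$ started at an already visited vertex, when $j$ vertices remain unvisited each step reaches a new one with probability $j/(n-1)$, so the expected number of steps to visit all vertices is $\sum_{j=1}^{n-1} (n-1)/j = (n-1) H_{n-1}$, which is of order $n \log n$. The sketch below evaluates this sum exactly; the function name is an assumption for this example.

```python
from fractions import Fraction

def expected_cover_steps(n):
    """Expected steps for a uniform random walk on K_n to visit all n vertices."""
    # With j vertices unvisited, the wait for a new one is geometric
    # with success probability j/(n-1), hence mean (n-1)/j.
    return sum(Fraction(n - 1, j) for j in range(1, n))

for n in (2, 4, 10):
    print(n, expected_cover_steps(n))
```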

Let Tp be the distribution on trees obtained by first choosing H from Gn,p and then generating 
a random spanning tree according to Process Bp. 

Lemma 7.2. There exists an absolute constant c such that for p > clogn/n the total variation 
distance between the distributions T and Tp is o(l). 

Proof. This is immediate from Lemma 7.1, as random trees from $\mathcal{T}$ or $\mathcal{T}_p$ are just functions of walks
from $D$ or $D_p$, respectively. □



Proof (of Theorem 1.4). In the random graph $H$, we generate two random trees by using one long
sequence of edges, with a breakpoint whenever we complete the generation of a spanning tree. In
the complete graph also, we generate two trees from such a sequence obtained from the uniform
random walk. Using the same coupling as in Lemma 7.2, we see that the distributions on these
sequences have variation distance $o(1)$. Therefore the spanning trees of $H$ obtained by the first
process have total variation distance $o(1)$ to random spanning trees of the complete graph. By
Theorem 1.3, the union of these trees has constant expansion with probability $1 - o(1)$ overall. □

With these results we are ready to prove our theorem about sparsifiers of random graphs:

Proof (of Theorem 1.5). We need the fact that for a sufficiently large constant $C$, with probability
$1 - o(1)$, all cuts $\delta_H(A)$ in the random graph $H$ satisfy

$$c_3 p|A|(n - |A|) \le |\delta_H(A)| \le c_4 p|A|(n - |A|).$$

This is well-known and follows immediately from appropriate Chernoff-type bounds.

We only need to prove the theorem for $|A| \le n/2$. We now prove the first inequality in the
statement of the theorem. By Theorem 1.3, with probability $1 - o(1)$, for any $A \subset V$ such that
$|A| \le n/2$, we have $|\delta_{H'}(A)| \ge c_5|A|$, and so $w(\delta_{H'}(A)) \ge c_5|A|pn \ge c_5 p|A|(n - |A|) \ge \frac{c_5}{c_4}|\delta_H(A)|$.

For the second inequality in the statement of the theorem, we need the fact that the maximum
degree of a vertex in a random spanning tree in the complete graph is $O(\log n)$. So the same holds
for random spanning trees generated by Process $B_p$. We then have $|\delta_{H'}(A)| \le c_6 \log n\,|A|$, and so
$w(\delta_{H'}(A)) \le c_6 \log n\,|A|pn \le \frac{2c_6}{c_3} \log n\,|\delta_H(A)|$, using $n \le 2(n - |A|)$. □



8 Discussion 

The problem of scalable routing in the presence of failures has motivated a novel construction of 
sparse expanders. The use of trees is particularly natural for routing. Our results suggest using 
a constant number of trees in total for routing, as opposed to the norm of one or more trees per 
destination. Further, the manner in which the trees are obtained is simple to implement and can 
lead to faster recovery since (a) paths exist after several failures and (b) fewer trees need to be 
recomputed in any case. 

One aspect of splicers that we have not fully explored is the stretch of the metric induced by
them. For the case of the complete graph, it is not hard to see that the diameter is $O(\log n)$ and
hence so is the expected stretch for a pair of random vertices. This continues to hold for $G_{n,p}$,
in fact giving better bounds for small $p$ (expected stretch of $O(\log\log n)$ for $p = \mathrm{poly}(\log n)/n$).
It remains to study the stretch of splicers for arbitrary graphs or bounded-degree graphs. This
seems to be an interesting question since on the complete graph, the expected stretch on one tree
is $O(\sqrt{n})$ while that of two trees is $O(\log n)$.

Finally, Process Bp appears interesting to study on its own. 